Face Generation

In this project, you'll define and train a DCGAN on a dataset of faces. Your goal is to get a generator network to generate new images of faces that look as realistic as possible!

The project will be broken down into a series of tasks from loading in data to defining and training adversarial networks. At the end of the notebook, you'll be able to visualize the results of your trained Generator to see how it performs; your generated samples should look like fairly realistic faces with small amounts of noise.

Get the Data

You'll be using the CelebFaces Attributes Dataset (CelebA) to train your adversarial networks.

This dataset is more complex than the number datasets (like MNIST or SVHN) you've been working with, and so, you should prepare to define deeper networks and train them for a longer time to get good results. It is suggested that you utilize a GPU for training.

Pre-processed Data

Since the project's main focus is on building the GANs, we've done some of the pre-processing for you. Each of the CelebA images has been cropped to remove parts of the image that don't include a face, then resized down to 64x64x3 NumPy images. Some sample data is show below.

If you are working locally, you can download this data by clicking here

This is a zip file that you'll need to extract in the home directory of this notebook for further loading and processing. After extracting the data, you should be left with a directory of data processed_celeba_small/

In [7]:
# Define hyperparameters

n_epochs = 25
batch_size = 128

# I used the hyperparameters from the DCGAN paper
d_lr=0.0002
g_lr=0.00006

d_betas=(0.5, 0.999)
g_betas=(0.5, 0.999)

use_label_smoothing = False
In [8]:
# Inspired by https://stackoverflow.com/a/8856387/5353461
import inspect
import re

def describe(arg):
    frame = inspect.currentframe()
    callerframeinfo = inspect.getframeinfo(frame.f_back)
    try:
        context = inspect.getframeinfo(frame.f_back).code_context
        caller_lines = ''.join([line.strip() for line in context])
        m = re.search(r'describe\s*\((.+?)\)$', caller_lines)
        if m:
            caller_lines = m.group(1)
            position = str(callerframeinfo.filename) + "@" + str(callerframeinfo.lineno)

            # Add additional info such as array shape or string length
            additional = ''
            if hasattr(arg, "shape"):
                additional += "[shape={}]".format(arg.shape)
            elif hasattr(arg, "__len__"):  # shape includes length information
                additional += "[len={}]".format(len(arg))

            # Use str() representation if it is printable
            str_arg = str(arg)
            str_arg = str_arg if str_arg.isprintable() else repr(arg)

            print(position, "describe(" + caller_lines + ") = ", end='')
            print(arg.__class__.__name__ + "(" + str_arg + ")", additional)
        else:
            print("Describe: couldn't find caller context")

    finally:
        del frame
        del callerframeinfo
In [9]:
# can comment out after executing
# !unzip processed_celeba_small.zip
In [10]:
data_dir = 'processed_celeba_small/'

"""
DON'T MODIFY ANYTHING IN THIS CELL
"""
import pickle as pkl
import matplotlib.pyplot as plt
import numpy as np
import problem_unittests as tests
#import helper

%matplotlib inline

Visualize the CelebA Data

The CelebA dataset contains over 200,000 celebrity images with annotations. Since you're going to be generating faces, you won't need the annotations, you'll only need the images. Note that these are color images with 3 color channels (RGB)#RGB_Images) each.

Pre-process and Load the Data

Since the project's main focus is on building the GANs, we've done some of the pre-processing for you. Each of the CelebA images has been cropped to remove parts of the image that don't include a face, then resized down to 64x64x3 NumPy images. This pre-processed dataset is a smaller subset of the very large CelebA data.

There are a few other steps that you'll need to transform this data and create a DataLoader.

Exercise: Complete the following get_dataloader function, such that it satisfies these requirements:

  • Your images should be square, Tensor images of size image_size x image_size in the x and y dimension.
  • Your function should return a DataLoader that shuffles and batches these Tensor images.

ImageFolder

To create a dataset given a directory of images, it's recommended that you use PyTorch's ImageFolder wrapper, with a root directory processed_celeba_small/ and data transformation passed in.

In [11]:
# necessary imports
import torch
from torchvision import datasets
from torchvision import transforms

torch.manual_seed(0);  # For reproducability
In [12]:
def get_dataloader(batch_size, image_size, data_dir='processed_celeba_small/'):
    """
    Batch the neural network data using DataLoader
    :param batch_size: The size of each batch; the number of images in a batch
    :param img_size: The square size of the image data (x, y)
    :param data_dir: Directory where image data is located
    :return: DataLoader with batched data
    """
    
    transform = transforms.Compose([
                                    transforms.Resize(image_size),
                                    transforms.CenterCrop(image_size),
                                    transforms.ToTensor(),
                            #         transforms.Normalize(mean=[0.485, 0.456, 0.406],
                            #                               std=[0.229, 0.224, 0.225]),        
                                   ])
    
    dataset = datasets.ImageFolder(data_dir, transform=transform)
    loader = torch.utils.data.DataLoader(dataset, batch_size=batch_size, shuffle=True)

    return loader

Create a DataLoader

Exercise: Create a DataLoader celeba_train_loader with appropriate hyperparameters.

Call the above function and create a dataloader to view images.

  • You can decide on any reasonable batch_size parameter
  • Your image_size must be 32. Resizing the data to a smaller size will make for faster training, while still creating convincing images of faces!
In [13]:
img_size = 32  # Size of both x and y dimensions
image_channels = 3

"""
DON'T MODIFY ANYTHING IN THIS CELL THAT IS BELOW THIS LINE
"""
# Call your function and get a dataloader
celeba_train_loader = get_dataloader(batch_size, img_size)

Next, you can view some images! You should seen square images of somewhat-centered faces.

Note: You'll need to convert the Tensor images into a NumPy type and transpose the dimensions to correctly display an image, suggested imshow code is below, but it may not be perfect.

In [14]:
# helper display function
def imshow(img):
    npimg = img.numpy()
    plt.imshow(np.transpose(npimg, (1, 2, 0)))

"""
DON'T MODIFY ANYTHING IN THIS CELL THAT IS BELOW THIS LINE
"""
# obtain one batch of training images
dataiter = iter(celeba_train_loader)
images, _ = dataiter.next() # _ for no labels

# plot the images in the batch, along with the corresponding labels
fig = plt.figure(figsize=(16, 4))
plot_size=16
for idx in np.arange(plot_size):
    ax = fig.add_subplot(2, plot_size/2, idx+1, xticks=[], yticks=[])
    imshow(images[idx])

Exercise: Pre-process your image data and scale it to a pixel range of -1 to 1

You need to do a bit of pre-processing; you know that the output of a tanh activated generator will contain pixel values in a range from -1 to 1, and so, we need to rescale our training images to a range of -1 to 1. (Right now, they are in a range from 0-1.)

In [15]:
# TODO: Complete the scale function
def scale(x, feature_range=(-1, 1)):
    ''' Scale takes in an image x and returns that image, scaled
       with a feature_range of pixel values from -1 to 1. 
       This function assumes that the input x is already scaled from 0-1.'''
    # assume x is scaled to (0, 1)
    
    min, max = feature_range
    
    x = min + x * (max - min)
    return x
In [16]:
"""
DON'T MODIFY ANYTHING IN THIS CELL THAT IS BELOW THIS LINE
"""
# check scaled range
# should be close to -1 to 1
img = images[0]
scaled_img = scale(img)

print('Min: ', scaled_img.min())
print('Max: ', scaled_img.max())
Min:  tensor(-0.9294)
Max:  tensor(1.)

Define the Model

A GAN is comprised of two adversarial networks, a discriminator and a generator.

Discriminator

Your first task will be to define the discriminator. This is a convolutional classifier like you've built before, only without any maxpooling layers. To deal with this complex data, it's suggested you use a deep network with normalization. You are also allowed to create any helper functions that may be useful.

Exercise: Complete the Discriminator class

  • The inputs to the discriminator are 32x32x3 tensor images
  • The output should be a single value that will indicate whether a given image is real or fake
In [17]:
import torch.nn as nn
import torch.nn.functional as F
In [18]:
# helper conv function
def conv_block(in_channels, out_channels, kernel_size=4, stride=2, padding=1, batch_norm=True, transpose=False):
    """
    Creates a convolutional block, with optional batch normalization.
    :param in_channels:  input volume depth
    :param out_channels: output volume depth
    :param batch_norm:   False for no batch norm
    :param transpose:    True for fractional convolution
    """

    layers = []
    
    if transpose:
        conv_module = nn.ConvTranspose2d
    else:
        conv_module = nn.Conv2d

    conv_layer = conv_module(in_channels, out_channels, 
                             kernel_size, stride, padding, bias=(not batch_norm))
    
    # append conv layer
    layers.append(conv_layer)

    if batch_norm:
        # append batchnorm layer
        layers.append(nn.BatchNorm2d(out_channels))
     
    # using Sequential container
    return nn.Sequential(*layers)
In [19]:
class Discriminator(nn.Module):
   
    def __init__(self, initial_channels=32):
        """
        Initialize the Discriminator Module
        :param initial_channels: The depth of the first convolutional layer
        
        initial_channels depth is progressively doubled by each following layer
        No batch orm on the first layer
        """
        super(Discriminator, self).__init__()

        # Usage of dropout in convolutional GANs with batch norm?
        # https://stats.stackexchange.com/q/401962/162527
        # self.dropout = nn.Dropout(p=0.3) 
        
        num_conv_layers = 3
        
        # Get # of channels of each conv block
        conv_channels = [initial_channels * 2**scaling for scaling in (range(num_conv_layers))]
        
        self.conv_layers = nn.ModuleList()
        self.conv_layers.append(conv_block(image_channels, conv_channels[0], batch_norm=False))
        for ch_in, ch_out in zip(conv_channels[:-1], conv_channels[1:]):  
            self.conv_layers.append(conv_block(ch_in, ch_out))

        self.conv_out_volume = conv_channels[-1] * (img_size // 2**(num_conv_layers)) ** 2    
        self.fc = nn.Linear(self.conv_out_volume, 1)
  
    def forward(self, x):
        """
        Forward propagation of the neural network
        :param x: The input to the neural network     
        :return: Discriminator logits; the output of the neural network
        """

        batches = x.shape[0]
        
        for conv in self.conv_layers:
            x = conv(x)
            x = F.leaky_relu(x, negative_slope=0.2)
        
        x = x.reshape(batches, -1)
        return self.fc(x)  # No sigmoid applied

    
# Test
d = Discriminator()
# print(d)
o = d(torch.rand(10, 3, 32, 32));
# describe(o.shape)
    
"""
DON'T MODIFY ANYTHING IN THIS CELL THAT IS BELOW THIS LINE
"""
# Have to modify... tests require model on cpu...

tests.test_discriminator(Discriminator)
Tests Passed

Generator

The generator should upsample an input and generate a new image of the same size as our training data 32x32x3. This should be mostly transpose convolutional layers with normalization applied to the outputs.

Exercise: Complete the Generator class

  • The inputs to the generator are vectors of some length z_size
  • The output should be a image of shape 32x32x3
In [20]:
class Generator(nn.Module):
    
    def __init__(self, z_size, out_dim=32):
        """
        Initialize the Generator Module
        :param z_size: The length of the input latent vector, z
        :param out_dim: The layers of the last, *non-output* convolutional layer
        """

        super(Generator, self).__init__()
        
        self.initial_dim = 4  # Size of both x and y dimension of first conv volume 
        
        num_conv_layers = 3  # Includes the final layer which outputs image_channels # of channels       
        self.initial_channels = out_dim * 2**(num_conv_layers-1)
        self.first_conv_volume = self.initial_dim**2 * self.initial_channels
        
        # Convert latent space to small spatial extent convolutional representation with many feature maps
        self.fc = nn.Linear(z_size, self.first_conv_volume)
        
        # Get # of channels of *non-output* conv blocks
        conv_channels = [out_dim * 2**scaling for scaling in reversed(range(num_conv_layers))]
        
        self.conv_layers = nn.ModuleList()
        for ch_in, ch_out in zip(conv_channels[:-1], conv_channels[1:]):  
            self.conv_layers.append(conv_block(ch_in, ch_out, transpose=True))

        # Output conv layer has no batch norm
        self.conv_layers.append(conv_block(conv_channels[-1], image_channels, batch_norm=False, transpose=True))
            
    def forward(self, x):
        """
        Forward propagation of the neural network
        :param x: The input to the neural network     
        :return: A 32x32x3 Tensor image as output
        """

        x = self.fc(x)
        x = x.reshape(-1, self.initial_channels, self.initial_dim, self.initial_dim)
        for conv_layer in self.conv_layers[:-1]:  # All but last conv block
            x = conv_layer(x)
            x = F.leaky_relu(x, negative_slope=0.2)
    
        x = self.conv_layers[-1](x)  # Final layer
        return torch.tanh(x)
    
# Test
g = Generator(100)
out = g(torch.rand(10, 100))
# print(g)
# describe(out.shape)

"""
DON'T MODIFY ANYTHING IN THIS CELL THAT IS BELOW THIS LINE
"""
tests.test_generator(Generator)
Tests Passed

Initialize the weights of your networks

To help your models converge, you should initialize the weights of the convolutional and linear layers in your model. From reading the original DCGAN paper, they say:

All weights were initialized from a zero-centered Normal distribution with standard deviation 0.02.

So, your next task will be to define a weight initialization function that does just this!

You can refer back to the lesson on weight initialization or even consult existing model code, such as that from the networks.py file in CycleGAN Github repository to help you complete this function.

Exercise: Complete the weight initialization function

  • This should initialize only convolutional and linear layers
  • Initialize the weights to a normal distribution, centered around 0, with a standard deviation of 0.02.
  • The bias terms, if they exist, may be left alone or set to 0.
In [21]:
from torch.nn import init

def weights_init_normal(m, mean=0, std=0.02):
    """
    Applies initial weights to certain layers in a model .
    The weights are taken from a normal distribution 
    with mean = 0, std dev = 0.02.
    :param m: A module or layer in a network    
    """

    classname = m.__class__.__name__
    
    if hasattr(m, 'weight') and ('Conv' in classname or 'Linear' in classname):
        init.normal_(m.weight.data, mean, std)
        if hasattr(m, 'bias') and m.bias is not None:
            init.constant_(m.bias.data, 0.0)
    elif 'BatchNorm2d' in classname:
        init.normal_(m.weight.data, 1.0, std)
        init.constant_(m.bias.data, 0.0)

g.apply(weights_init_normal);  # apply weights_init_normal function

Build complete network

Define your models' hyperparameters and instantiate the discriminator and generator from the classes defined above. Make sure you've passed in the correct input arguments.

In [22]:
"""
DON'T MODIFY ANYTHING IN THIS CELL THAT IS BELOW THIS LINE
"""
def build_network(d_conv_dim, g_conv_dim, z_size, verbose=True):
    # define discriminator and generator
    D = Discriminator(d_conv_dim)
    G = Generator(z_size=z_size, out_dim=g_conv_dim)

    # initialize model weights
    D.apply(weights_init_normal)
    G.apply(weights_init_normal)

    if verbose:
        print(D)
        print()
        print(G)
    
    return D, G

Exercise: Define model hyperparameters

In [23]:
# Define model hyperparams
d_conv_dim = 32
g_conv_dim = 32
z_size = 100

"""
DON'T MODIFY ANYTHING IN THIS CELL THAT IS BELOW THIS LINE
"""
D, G = build_network(d_conv_dim, g_conv_dim, z_size)
Discriminator(
  (conv_layers): ModuleList(
    (0): Sequential(
      (0): Conv2d(3, 32, kernel_size=(4, 4), stride=(2, 2), padding=(1, 1))
    )
    (1): Sequential(
      (0): Conv2d(32, 64, kernel_size=(4, 4), stride=(2, 2), padding=(1, 1), bias=False)
      (1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    )
    (2): Sequential(
      (0): Conv2d(64, 128, kernel_size=(4, 4), stride=(2, 2), padding=(1, 1), bias=False)
      (1): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    )
  )
  (fc): Linear(in_features=2048, out_features=1, bias=True)
)

Generator(
  (fc): Linear(in_features=100, out_features=2048, bias=True)
  (conv_layers): ModuleList(
    (0): Sequential(
      (0): ConvTranspose2d(128, 64, kernel_size=(4, 4), stride=(2, 2), padding=(1, 1), bias=False)
      (1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    )
    (1): Sequential(
      (0): ConvTranspose2d(64, 32, kernel_size=(4, 4), stride=(2, 2), padding=(1, 1), bias=False)
      (1): BatchNorm2d(32, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    )
    (2): Sequential(
      (0): ConvTranspose2d(32, 3, kernel_size=(4, 4), stride=(2, 2), padding=(1, 1))
    )
  )
)

Training on GPU

Check if you can train on GPU. Here, we'll set this as a boolean variable train_on_gpu. Later, you'll be responsible for making sure that

  • Models,
  • Model inputs, and
  • Loss function arguments

Are moved to GPU, where appropriate.

In [24]:
"""
DON'T MODIFY ANYTHING IN THIS CELL
"""
import torch

# Check for a GPU
train_on_gpu = torch.cuda.is_available()
if not train_on_gpu:
    print('No GPU found. Please use a GPU to train your neural network.')
else:
    print('Training on GPU!')
    
device = 'cuda' if torch.cuda.is_available() else 'cpu'

D.to(device)
G.to(device);
Training on GPU!

Discriminator and Generator Losses

Now we need to calculate the losses for both types of adversarial networks.

Discriminator Losses

  • For the discriminator, the total loss is the sum of the losses for real and fake images, d_loss = d_real_loss + d_fake_loss.
  • Remember that we want the discriminator to output 1 for real images and 0 for fake images, so we need to set up the losses to reflect that.

Generator Loss

The generator loss will look similar only with flipped labels. The generator's goal is to get the discriminator to think its generated images are real.

Exercise: Complete real and fake loss functions

You may choose to use either cross entropy or a least squares error loss to complete the following real_loss and fake_loss functions.

In [25]:
def real_loss(D_out, smooth=False):
    '''Calculates how close discriminator outputs are to being real.
   param, D_out: discriminator logits
   param, smooth: value to assign to labels, or 0.9 if True
   return: real loss'''

    batch_size = D_out.shape[0]
    # label smoothing
    if smooth:  # smooth, real labels = 0.9 or as given by smooth parameter
        labels = torch.ones(batch_size)*0.9
    else:
        labels = torch.ones(batch_size) # real labels = 1
    # move labels to GPU if available     
    if train_on_gpu:
        labels = labels.cuda()
    # binary cross entropy with logits loss
    criterion = nn.BCEWithLogitsLoss()
    # calculate loss
    loss = criterion(D_out.squeeze(), labels)
    return loss

def fake_loss(D_out):
    '''Calculates how close discriminator outputs are to being fake.
   param, D_out: discriminator logits
   return: fake loss'''

    batch_size = D_out.shape[0]
    labels = torch.zeros(batch_size) # fake labels = 0
    if train_on_gpu:
        labels = labels.cuda()
    criterion = nn.BCEWithLogitsLoss()
    # calculate loss
    loss = criterion(D_out.squeeze(), labels)
    return loss

Optimizers

Exercise: Define optimizers for your Discriminator (D) and Generator (G)

Define optimizers for your models with appropriate hyperparameters.

In [26]:
import torch.optim as optim

d_optimizer = optim.Adam(D.parameters(), lr=d_lr, betas=d_betas)
g_optimizer = optim.Adam(G.parameters(), lr=g_lr, betas=g_betas)

Training

Training will involve alternating between training the discriminator and the generator. You'll use your functions real_loss and fake_loss to help you calculate the discriminator losses.

  • You should train the discriminator by alternating on real and fake images
  • Then the generator, which tries to trick the discriminator and should have an opposing loss function

Saving Samples

You've been given some code to print out some loss statistics and save some generated "fake" samples.

Exercise: Complete the training function

Keep in mind that, if you've moved your models to GPU, you'll also have to move any model inputs to GPU.

In [24]:
# Keep track of loss and generated, "fake" samples
# Global so that training can be interrupted and stats still obtained
samples = []
losses = []

def train(D, G, n_epochs, print_every=50):
    '''Trains adversarial networks for some number of epochs
       param, D: the discriminator network
       param, G: the generator network
       param, n_epochs: number of epochs to train for
       param, print_every: when to print and record the models' losses
       return: D and G losses'''
    
    # Get some fixed data for sampling. These are images that are held
    # constant throughout training, and allow us to inspect the model's performance
    sample_size = 16
    fixed_z = torch.Tensor(sample_size, z_size).uniform_(-1, 1).to(device)

    printings = 0  ########## for debug
    # epoch training loop
    for epoch in range(n_epochs):

        # batch training loop
        for batch_i, (real_images, _) in enumerate(celeba_train_loader):

            batch_size = real_images.shape[0]
            real_images = scale(real_images.to(device))

            # =============
            # Descriminator
            # =============
            d_optimizer.zero_grad()
            
            real_losses = real_loss(D(real_images), smooth=use_label_smoothing)
            
            z = torch.Tensor(sample_size, z_size).uniform_(-1, 1).to(device)
            fake_images = G(z)
            fake_losses = fake_loss(D(fake_images))
            
            d_loss = real_losses + fake_losses

            # Backprop
            d_loss.backward()
            d_optimizer.step()

            # =============
            # Generator
            # =============
            g_optimizer.zero_grad()
            
            z = torch.Tensor(sample_size, z_size).uniform_(-1, 1).to(device)
            fake_images = G(z)
            
            g_loss = real_loss(D(fake_images), smooth=False)
            
            # Backprop
            g_loss.backward()
            g_optimizer.step()

            
            # ===============================================
            #              END OF YOUR CODE
            # ===============================================

            # Print some loss stats
            if batch_i % print_every == 0:
                printings += 1
                # append discriminator loss and generator loss
                losses.append((d_loss.item(), g_loss.item()))
                # print discriminator and generator loss
#                 print('Epoch [{:5d}/{:5d}] | d_loss: {:6.4f} | g_loss: {:6.4f}'.format(
#                         epoch+1, n_epochs, d_loss.item(), g_loss.item()))
                print('{:3d} Epoch [{:5d}/{:5d}] | d_loss: {:6.4f} | g_loss: {:6.4f}'.format(
                    printings, epoch+1, n_epochs, d_loss.item(), g_loss.item()))
    
                G.eval() # for generating samples
                samples_z = G(fixed_z)
                samples.append(samples_z)
                G.train() # back to training mode
                
                view_samples(-1, samples);



        ## AFTER EACH EPOCH##    
        # this code assumes your generator is named G, feel free to change the name
        # generate and save sample, fake images
        G.eval() # for generating samples
        samples_z = G(fixed_z)
        samples.append(samples_z)
        G.train() # back to training mode

    # finally return losses
    return losses

Set your number of training epochs and train your GAN!

In [25]:
# Training
from workspace_utils import active_session

with active_session():
    losses = train(D, G, n_epochs=n_epochs, print_every=400)
  1 Epoch [    1/   25] | d_loss: 1.6476 | g_loss: 0.6420
  2 Epoch [    1/   25] | d_loss: 0.0353 | g_loss: 5.4191
  3 Epoch [    2/   25] | d_loss: 0.2994 | g_loss: 2.5662
  4 Epoch [    2/   25] | d_loss: 0.0875 | g_loss: 3.1234
  5 Epoch [    3/   25] | d_loss: 0.1528 | g_loss: 3.3711
  6 Epoch [    3/   25] | d_loss: 0.2329 | g_loss: 1.5535
  7 Epoch [    4/   25] | d_loss: 0.2977 | g_loss: 2.4964
  8 Epoch [    4/   25] | d_loss: 0.5837 | g_loss: 1.2303
  9 Epoch [    5/   25] | d_loss: 1.1419 | g_loss: 0.8155
 10 Epoch [    5/   25] | d_loss: 0.3628 | g_loss: 4.5634
 11 Epoch [    6/   25] | d_loss: 0.0913 | g_loss: 4.1997
 12 Epoch [    6/   25] | d_loss: 0.0911 | g_loss: 3.6185
 13 Epoch [    7/   25] | d_loss: 0.6140 | g_loss: 2.6411
 14 Epoch [    7/   25] | d_loss: 0.1730 | g_loss: 6.3415
 15 Epoch [    8/   25] | d_loss: 0.1226 | g_loss: 2.9213
 16 Epoch [    8/   25] | d_loss: 0.2174 | g_loss: 3.4439
 17 Epoch [    9/   25] | d_loss: 0.1468 | g_loss: 2.4701
 18 Epoch [    9/   25] | d_loss: 0.4165 | g_loss: 4.5313
 19 Epoch [   10/   25] | d_loss: 0.0565 | g_loss: 1.3238
 20 Epoch [   10/   25] | d_loss: 0.1519 | g_loss: 2.6434
 21 Epoch [   11/   25] | d_loss: 0.2163 | g_loss: 3.7406
/opt/conda/lib/python3.6/site-packages/matplotlib/pyplot.py:523: RuntimeWarning: More than 20 figures have been opened. Figures created through the pyplot interface (`matplotlib.pyplot.figure`) are retained until explicitly closed and may consume too much memory. (To control this warning, see the rcParam `figure.max_open_warning`).
  max_open_warning, RuntimeWarning)
 22 Epoch [   11/   25] | d_loss: 0.2572 | g_loss: 4.0464
 23 Epoch [   12/   25] | d_loss: 0.0611 | g_loss: 2.2680
 24 Epoch [   12/   25] | d_loss: 0.2421 | g_loss: 4.9730
 25 Epoch [   13/   25] | d_loss: 0.0800 | g_loss: 5.4299
 26 Epoch [   13/   25] | d_loss: 0.2542 | g_loss: 1.6562
 27 Epoch [   14/   25] | d_loss: 0.0763 | g_loss: 4.8799
 28 Epoch [   14/   25] | d_loss: 0.0419 | g_loss: 1.9274
 29 Epoch [   15/   25] | d_loss: 0.0648 | g_loss: 1.9852
 30 Epoch [   15/   25] | d_loss: 0.0343 | g_loss: 8.5649
 31 Epoch [   16/   25] | d_loss: 0.2463 | g_loss: 2.7765
 32 Epoch [   16/   25] | d_loss: 0.1040 | g_loss: 2.6515
 33 Epoch [   17/   25] | d_loss: 0.0057 | g_loss: 4.9774
 34 Epoch [   17/   25] | d_loss: 0.0449 | g_loss: 6.0094
 35 Epoch [   18/   25] | d_loss: 0.0284 | g_loss: 4.1540
 36 Epoch [   18/   25] | d_loss: 0.1452 | g_loss: 5.8451
 37 Epoch [   19/   25] | d_loss: 0.1337 | g_loss: 3.2581
 38 Epoch [   19/   25] | d_loss: 0.0273 | g_loss: 5.1477
 39 Epoch [   20/   25] | d_loss: 0.1434 | g_loss: 3.3572
 40 Epoch [   20/   25] | d_loss: 0.0290 | g_loss: 6.4730
 41 Epoch [   21/   25] | d_loss: 0.0282 | g_loss: 5.8093
 42 Epoch [   21/   25] | d_loss: 0.2102 | g_loss: 7.4929
 43 Epoch [   22/   25] | d_loss: 0.9539 | g_loss: 1.6989
 44 Epoch [   22/   25] | d_loss: 0.0337 | g_loss: 6.1438
 45 Epoch [   23/   25] | d_loss: 0.0621 | g_loss: 4.2249
 46 Epoch [   23/   25] | d_loss: 0.5201 | g_loss: 2.3449
 47 Epoch [   24/   25] | d_loss: 0.0479 | g_loss: 6.2949
 48 Epoch [   24/   25] | d_loss: 0.0089 | g_loss: 5.5374
 49 Epoch [   25/   25] | d_loss: 0.0061 | g_loss: 5.4780
 50 Epoch [   25/   25] | d_loss: 0.1302 | g_loss: 6.9539

Save and load the model and sample images

In [27]:
save = False  # Only save if true
In [28]:
# Setup pathnames for save / restore
from datetime import datetime
timestamp = datetime.now().replace(microsecond=0).isoformat()

train_samples_path = f'train_samples_{timestamp}.pkl'
train_samples_latest_path = f'train_samples_latest.pkl'

gan_path = f'gan-{timestamp}.pth'
gan_latest_path = f'gan-latest.pth'
In [29]:
import os, tempfile, time

def symlink_force(target, link_name):
    '''
    Create a symbolic link pointing to target named link_name.
    Overwrite target if it exists.
    '''
    # Posted at:
    # https://stackoverflow.com/a/55741590/5353461 (A to my own Q)
    # https://stackoverflow.com/a/55742015/5353461 (A to Q with most upvotes)
    # https://stackoverflow.com/a/55741959/5353461 (A to: How to override bad symlink with python)

    # Possible race condition between temp link creation and overwriting link_name
    # https://bugs.python.org/issue36656 (python issue)

    # os.replace may fail if files are on different filesystems.
    # Therefore, use the directory of target
    link_dir = os.path.dirname(target)

    # os.symlink requires that the target does NOT exist.
    # Avoid race condition of file creation between mktemp and symlink:
    while True:
        temp_pathname = tempfile.mktemp(suffix='.tmp', \
                        prefix='symlink_force_tmp-', dir=link_dir)
        try:
            os.symlink(target, temp_pathname)
            break  # Success, exit loop
        except FileExistsError:
            time.sleep(0.001)  # Prevent high load in pathological conditions
        except:
            raise
    os.replace(temp_pathname, link_name)
In [30]:
# Save training generator samples
if save:
    with open(train_samples_path, 'wb') as f:
        pkl.dump(samples, f)

    # Create symlink to lastest saved version
    symlink_force(train_samples_path, train_samples_latest_path)

    print(f'Saved samples to {train_samples_path}, and linked to {train_samples_latest_path}')
In [31]:
# Save the model
if save:
    checkpoint = {'losses':       losses,
                  'd_state_dict': D.state_dict(),
                  'g_state_dict': G.state_dict(), 
                  'd_optimizer':  d_optimizer.state_dict(),
                  'g_optimizer':  d_optimizer.state_dict()}

    torch.save(checkpoint, gan_path)

    # Create symlink to lastest saved version
    symlink_force(gan_path, gan_latest_path)
    print(f'Saved checkpoint to {gan_path}, and linked to {gan_latest_path}')

Load the previously saved checkpoint and sample images

In [32]:
## Load checkpoint from latest path

load_path = gan_latest_path  # Allow for loading latest or dated pathname

D, G = build_network(d_conv_dim, g_conv_dim, z_size, verbose=False)

# Work around error:
# RuntimeError: cuda runtime error (35) : CUDA driver version is insufficient for CUDA runtime version at torch/csrc/cuda/Module.cpp:51
if torch.cuda.is_available():
    map_location=lambda storage, loc: storage.cuda()
else:
    map_location='cpu'
    
checkpoint = torch.load(load_path, map_location=map_location)

losses = checkpoint['losses']
D.load_state_dict(checkpoint['d_state_dict'])
G.load_state_dict(checkpoint['g_state_dict'])
d_optimizer.load_state_dict(checkpoint['d_optimizer'])
g_optimizer.load_state_dict(checkpoint['g_optimizer'])
print(f'Loaded checkpoint from {load_path}')
Loaded checkpoint from gan-latest.pth
In [33]:
# Load samples from generator, taken while training
load_path = train_samples_latest_path  # Allow for loading latest or dated pathname

with open(load_path, 'rb') as f:
    samples = pkl.load(f)
    
print(f'Loaded training sample images from {load_path}')
Loaded training sample images from train_samples_latest.pkl

Training loss

Plot the training losses for the generator and discriminator, recorded after each epoch.

In [34]:
fig, ax = plt.subplots()
losses = np.array(losses)
plt.plot(losses.T[0], label='Discriminator', alpha=0.5)
plt.plot(losses.T[1], label='Generator', alpha=0.5)
plt.title("Training Losses")
plt.legend()
plt.show()

Generator samples from training

View samples of images from the generator, and answer a question about the strengths and weaknesses of your trained models.

In [35]:
# helper function for viewing a list of passed in sample images
def view_samples(epoch, samples):
#     fig, axes = plt.subplots(figsize=(16,4), nrows=2, ncols=8, sharey=True, sharex=True)
    fig, axes = plt.subplots(figsize=(8,8), nrows=4, ncols=4, sharey=True, sharex=True)
    for ax, img in zip(axes.flatten(), samples[epoch]):
        img = img.detach().cpu().numpy()
        img = np.transpose(img, (1, 2, 0))
        img = ((img + 1)*255 / (2)).astype(np.uint8)
        ax.xaxis.set_visible(False)
        ax.yaxis.set_visible(False)
        im = ax.imshow(img.reshape((32,32,3)))
In [36]:
# Display latest samples
if samples:
    view_samples(-1, samples);

Question: What do you notice about your generated samples and how might you improve this model?

When you answer this question, consider the following factors:

  • The dataset is biased; it is made of "celebrity" faces that are mostly white
  • Model size; larger models have the opportunity to learn more features in a data feature space
  • Optimization strategy; optimizers and number of epochs affect your final result

Answer:

  • I'm not happy with the face at row 2, column 1.
  • The training set has the faces' chins cut off. I would prefer to generate a dataset which also took some pixels below the bottom of the face bounding box.
  • I would like to try another convolutional block in both discriminator and generator to see how increasing the complexity of the model affects the results.
  • I would like to train for more epochs, to see how face realism is affected with increased training. Given the subjective nature of this, perhaps I could find an external objective metric.
  • The resolution of 32x32 is low. The initial images are 64x64 but scaled down to 32x32. I'd like to scale out the convolutional layer's dimensions to the original 64x64 size, or even 128x128 (using interpolated upscaling on the dataset).
  • Better quality 32x32 images could possibly be obtained by generating 64x64 images and then downsampling
  • I would like to be able to find out which regions of latent space map to under-represented demographics (eg children) and how well the model performs in generating faces from those demographics.
  • Recognising that some demographics may have fewer examples than others, I would be interested in exploring weighting the loss funciton based on the number of samples.